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Attached are a marked-up copy of the originally filed specification and a 

clean substitute specification in accordance with 37 C.F.R. §§ 1.121(b) (3) and 

1.125(c). The substitute specification contains no new matter. 

Substitute Specification 

TITLE OF THE INVENTION 

Round Key Generation for AES Riindael Block Cipher 

CROSS-REFERENCE TO RELATED APPLICATIONS 

None 

BACKGROUND OF THE INVENTION 

[0001] The present invention relates to methods and apparatus for implementation 

of the Advanced Encryption Standard (AES) algorithm and in particular to methods 
and apparatus for real-time generation of the round keys required during the 
encryption and decryption rounds of the algorithm. 

[0002] The invention has particular, though not exclusive, application in 

cryptographic devices^ such as those installed in smart cards and other devices 

where processor and memory resources are limited. 
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[0003] The AES (Rijndael) algorithm may be implemented using a 128-bit, a 192- 
bit^ or a 256-bit key operating on successive 128-bit blocks of input data. During 
implementation of an encryption operation or a decryption operation according to 
the AES algorithm (hereinafter, generally a "cryptographic operation"), the original 
or "initial" key must be expanded to provide a round key for each successive round 
of the encryption or decryption operation. The number of rounds (Nr) is 10 for 128- 
bit keys, 12 for 192-bit keys, and 14 for 256-bit keys. 

[0004] Thus, the expanded round key is the size of the initial key multiplied by 
(Nr+i). In the case of a 128-bit key, the expanded key comprises 128 x 11=1408 bits; 
for the 192-bit key, the expanded key comprises 128 x 13=1664 bits; and for the 256- 
bit key, the expanded key comprises 128 x 15=1920 bits. 

[0005] Storage of this expanded key consumes a significant amount of memory 
space in cryptographic engines, which is particularly significant in certain 
applications, such as the provision of cryptographic engines on smartcards and the 
like where memory space is limited. Provision of this space is not strictly necessary 
if round keys can be generated during operation of the cryptographic engine without 
causing delay thereto. 

BRIEF SUMMARY OF THE INVENTION 

[0006] The present invention is directed towards a key expansion method and 

apparatus to implement the round key generation function in real time using a 
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substantially reduced memory allocation than existing techniques. 

[0007] The present invention r oe o g nisos recognizes that real time generation of the 

successive round keys can be performed in parallel with execution of the encr3T)tion 

or decryption algorithm in the cryptographic engine^ aadr-have little impact on the 

execution time of the encryption or decryption procesSj, and with h ave r educed 

amounts of hardware. 

[0008] According to one aspect, the present invention provides a method of 
generating successive round keys of an expanded key from an initial cryptographic 
key for use in an encryption and/or decryption engine, comprising the steps of: 
[0009] storing the Nk words of the initial key in Nk locations of a memory; 
[0010] providing the initial key to a cryptographic engine for performing a first 
cryptographic round; 

[0011] repeatedly retrieving a selected first word and a selected second word of the 
expanded key, at least one of which is retrieved from the memory, and generating 
from the selected first and second words a successive subsequent word of the 

expanded key; 

[0012] providing the generated words of the expanded key to the cryptographic 

engine as round keys for performing subsequent cryptographic rounds; and 

[0013] storing successive ones of the generated subsequent words in the memory by 

cyclically overwriting previously generated words of the expanded key. 

[0014] According to another aspect, the present invention provides a round key 
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generator for generating successive round keys of an expanded key from an initial 

cryptographic key for use in an encryption and/or decryption engine, comprising: 

[0015] a memory for storing the Nk words of the initial key; 

[0016] an expansion processor for repeatedly retrieving a selected first word and a 

selected second word of the expanded key, at least one of which is retrieved from the 

memory, and generating from the selected first and second words a successive 

subsequent word of the expanded key; 

[0017] means for providing the generated words of the expanded key to the 
cryptographic engine as round keys for performing subsequent cryptographic 
rounds; and 

[0018] means for storing successive ones of the generated subsequent words in the 
memory by cyclically overwriting previously generated words of the expanded key. 
[0019] According to another aspect, the present invention provides an AES round 
constant function generator comprising a shift register having: 
[0020] a first control input for causing a left shift of the register contents; 
[0021] a second control input for causing a right shift of the register contents; and 
[0022] a third control input for causing a preset of the shift register contents to one 
of several possible values. 
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BRIEF DESCRIPTION OF THE DRAWI NGS 

[0023] Embodiments of the present invention will now be described by way of 
example and with reference to the accompanying drawings in which: 
[0024] FIG. 1 is a flow diagram illustrating implementation of an encryption 
operation using the AES block cipher algorithm; 

[0025] FIG. 2 is a flow chart of the AES round key schedule used to generate the 
expanded encryption key that provides the plural round keys required during an 
encryption operation; 

[0026] FIG. 3 is a schematic block diagram of a round key generator according to 
the present invention; 

[0027] FIG. 4 is a schematic block diagram of the key expansion processor for 
generating the succession of round keys during encryption; and 
[0028] FIG. 5 is a schematic block diagram of the key expansion processor for 
generating the succession of round keys during decrj^tion, 

DETAILED DESCRIPTION OF THE INVENTION 

[0029] The AES algorithm for encryption of plaintext to ciphertext is shown in FIG. 
1. The AES algorithm may be implemented using a 128-bit, a 192-bit. or a 256-bit 
key operating on successive 128-bit blocks of input data. FIG. 1 will now be 
described in the context of the basic implementation using a 128-bit key. 
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[0030] An initial 128-bit block of input plaintext 10 is XOR-combined 11 with an 

original 128-bit key 12 in an initial round 15. The output 13 from this initial round 

15 is then passed through a number of repeated transform stages, in an encryption 

round 28 which i n clude s include the SubBvtcs Sub Bytes transform 20, the 

ShiftRows Shift Rows transform 21^ and the MixColumns Mix Columns transform 

22 in accordance with the defined AES algorithm. 

[0031] The output from the Mi- xColumns Mix Columns transform 22 is XOR- 
combined 23 with a new 128-bit round key 26, which has been derived from the 
initial key 12. The output from this XOR-combination is fed back fedback to pass 
through the encryption round 28 a number of further times. 

[0032] For each successive iteration through the encryption round 28, a new round 
key 26 is derived from the existing round key 26 according to the AES round key 
schedule. 

[0033] The number of iterations (Nr-i) of the encryption round 28 is 9 where a 128- 
bit encryption key is being used, 11 where a 192-bit encryption key is being used, 

and 13 where a 256-bit encryption key is being used. 

[0034] After the requisite number (Nr-i) of encryption r ounds 28, a final round, Nr, is 
entered under the control of decision box 24. The final round 30 comprises a further 

SubBytos Sub Bytes t ransform 31, a further ShiftRows Shift Rows transform 32, 

and a subsequent XOR-combination 33 of the result with a final round key 36 
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generated 35 from the previous round key. The output therefrom comprises the 



[0035] It will be noted from FIG. 1 that the implementation of the AES encryption 
algorithm requires generation of new round keys 25 from the initial key 12 ready 
for each round 28, 30. 

[0036] Throughout the present specification, the keys will be e xprosGi oB -expressed 
in terms of the number, Nk, of 32-bit words. For an initial 128-bit encryption key 12, 
ie. 4x32-bit words, Nk=4, and the "expanded" key comprises 11x4 32-bit words, or 44 
words, written as W (0) ... W (43). For an initial 192-bit encryption key (Nk=6), the 
expanded key rises to 13x4 32-bit words, or 52 words, written as W(0) . . . W(52). 
For an initial 256-bit encryption key (Nk=8), the expanded key rises to 15x4 32-bit 
words, or 60 words, written as W(0) . . . W(59). 

[0037] During execution of the AES decryption algorithm, the round keys are the 
same as for encryption, but presented in the reverse order. 

[0038] With reference to FIG. 2, the general AES key expansion algorithm for 

generating the successive round keys will now be described, in the context of a 128- 
bit key (number of words in the key, Nk=4). It will be understood that the technique 
also applies to 192-bit (Nk=6) and 256-bit (Nk=8) keys. 

[0039] The initial key 50, comprising four 32-bit words W(0), W(l), W(2) and W(3) is 
loaded into suitable memory locations 51o, 51i, 5l2, 5I3. In a conventional 




- cipher text output 39 of the encryption algorithm. 



-29- 



Application No.: 10/519,586 
Attorney' s Docket No. : 81131258 

implementation, the memory includes sufficient space, at 5 In to accommodate all 

words of the expanded key, once it is generated. 

[0040] Each new sequence of four words in the expanded key comprises a new round 
key and will be referred to as a "stretch". More generally, a stretch is W(i) to 
W(i+Nk) where i is an integer multiple of Nk, minus 1 (0, 3, 7 etc for Nk=4; 0, 7, 15 

for Nk=8). At the outset, the only stretch is the initial key 50, and the first task is to 
generate the first word of a new stretch, the decision box 53 thereby indicating path 
"yes". 

[0041] In the initial pass of the key expansion algorithm, the last word of the 
preceding stretch (5I3) is extracted (at 52) aad-^the bits are left shifted (step 54), 
and then transformed according to the AES key expansion algorithm using an S-box 
look-up 55. The S-box function is the same as that for the AES SubBytes transform 
20 (FIG. 1). The resulting 32-bit output 56 is transformed by XOR-combination 57 of 
the first eight bits only with a round constant Rcon 58 defined in the AES key 
schedule. The output 60 from this operation is then XOR-combined 62 with the first 
word of the preceding stretch (ie. 51o) and this result- W(4)-written 63 to memory 
at 5I4. 

[0042] In the second pass through the flow diagram, the next word W(5) of the 

second stretch is derived. This being the second word of a stretch, the left hand path 

of the flow diagram is taken, the newly generated word, W(4), at 5I4, being copied 

61 directly to the Wtmp buffer 60 ready for simple XOR-combination 62 with the 
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next word 51i of the initial key 50. The new generated word W(5) is written (at 63) 

to memory 5I5. 

[0043] The procedure repeats the left hand path a further two times, generating the 
last two words W(6) and W(7) of the second stretch, before recommencing the cycle 
for the third stretch, using the right hand path. 

[0044] In effect, it will be seen that each word of each new stretch is the XOR- 
combination of its immediately preceding word and the word in the corresponding 
position of the preceding stretch, with the exception of the first word in each stretch. 
For the first word in each stretch, it is a function of the immediately preceding word 
that is used, rather than the immediately preceding word itself, the function being 
executed according to steps 54-59 of FIG. 2. 

[0045] The principle deployed for 192-bit (Nk=6) and 256-bit (Nk=8) keys is the 
same, except that each stretch is respectively six words or eight words in length. 
[0046] Each successive group of four words is used as the round key for each 
successive round 28, 30 of the encryption procedure of FIG. 1. During decr5^tion, 

the round keys are applied in reverse order. 

[0047] In one aspect, the present invention roc osf niGC S recognizes that it is only 
necessary to retain in memory the Nk words of the original key together with the 
most recent Nk words of the expanded round key at any one time. The most recently 
generated four words (or, more generally, four successive words in the currently 
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held Nk words) are fed into the encryption engine at steps 23 or 33, while the held 

Nk words are used to generate the new stretch as described in FIG. 2. 

[0048] Providing that the new stretches are generated fast enough to keep up with 

the encryption engine, and maintained in ssnichronism therewith (within the 

tolerance of the difference of a stretch length (Nk=4, 6 or 8) and round key length 

(=4) so that the most recently generated stretch includes the round key which is 

currently required in the encryption engine, then very limited memory capacity and 

buffer requirements only need be provided. 

[0049] With reference to FIG. 3, the round key generator 100 comprises a RAM 
sector 101 that is divided into equal parts 102, 103, each part having a size of, for 
example, 4x32 bit words (for the 128-bit key algorithm), 6x32 bit words (for the 192 
bit key generator) or 8x32 bit words (for the 256 bit key algorithm). Throughout the 
following description, a round key generator 100 capable of handling a 256-bit key 
algorithm will be assumed, this being adaptable to accommodate processing of 
smaller key lengths. 

[0050] For convenience, the two parts 102, 103 will be referred to as the lower half 

103 and the upper half 102. The respective halves are referenced for read access by 

an OffSetHiRd pointer 105 via BHH ^multiplexer 104. For OfESetHiRd=0, lower half 

103 is read; for OffSetHiRd=l, upper half 102 is read. In the lower half 103 of the 

RAM 101, the initial encr3q)tion key 50 is stored in locations Wo to W? (ie. the first 

stretch W (0) . . . W (7) for Nk=8); in the upper half 102, the new calculated stretch, 
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e^g. W (8) . . . W (15) is stored in corresponding upper half locations Wo . . . W?. A 

pointer OffSetHiWr (not shown) may be used to point to the memory half being 

written to). As each successive stretch is generated and used in the encryption 

engine, the next stretch values (e.g. W (16) . . . W (23)) are calculated and 

overwritten into the upper half 102. 

[0051] The individual locations Wo ... W? (lower half) or Wi . . . W? (upper half) are 
referenced for read and write operations by an OffSetCnt counter 111 which is a 
three-bit counter that points to one of the word locations in the upper half and/or 
the corresponding location in the lower half. In general, the OffSetCnt counter 111 
is implemented as a modulo Nk up/down counter. 

[0052] A round key counter 110 maintains a count of the currently calculated round 
key (i.e. the current stretch). A state machine 106 maintains overall control of the 
round key generation process, and an expansion processor 107 performs the 
computation of the expanded round key values (words). 

[0053] When the encryption operation for the current plaintext block is complete, 

the procedure may be recommenced from the encryption key in the lower half 103. 

Alternatively, if a decryption operation is required, the first round key of the 

decryption cycle comprises the most recently calculated round key from the upper 

RAM half 102, which may be moved into the lower half, or read from the upper half. 

Successive decryption round keys are calculated in similar manner. At the 

completion of the decryption round key generation operation, the original 
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encryption key is returned and can be restored to or retained in the lower half of 

RAM 101 for a subsequent encryption operation. 

[0054] FIG. 4 shows a block diagram of the expansion processor 107. The expansion 
processor 107 comprises a first 32-bit register W, shown at 120, and a second 32-bit 
register Wtmp, shown at 121. Each register W, Wtmp can be filled directly from the 
RAM 101. A 32-bit, two input multiplexer 122 also allows the filling of Wtmp via a 
feedback line 123. The expansion processor 107 further includes special processing 
logic 150 for effecting the transforms RotateWord 154, SubWord 155, Rcon 158 as 
described in connection with transforms 54, 55, 58 in FIG. 2. XOR gate 157 is 
analogous to XOR gate 57 in FIG. 2. A 32-bit multiplexer 124 selects output from 
either the special processing logic 150 or direct from register Wtmp 121 to provide 
input to 32-bit wide XOR gate 162. 

[0055] At the start of an encryption operation, the initial key 50 (W(0) . . . W(7)) is 
loaded into RAM 101 into the lower half 103, positions Wo . . . W?. The first word 
W(0) of the initial key 50 is loaded into the buffer 120 from RAM 101 and the last 
word W(Nk-l) of the initial key 50 is loaded into buffer Wtmp 121. More generally, for 
successive rounds of encryption, W(i) is loaded into buffer 120, and the last 
calculated value of W(i+Nk) is stored in Wtmp 121. 

[0056] As defined with reference to FIG. 2, during a key expansion process for 
encryption, one the following equations applies to the generation of each new word 
W(i) of the expanded round key: 
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[0057] For all i except those below (i^e. no special processing 150), 

[0058] Rule 1: W(i) = W(i-Nk) e W(i-l) 

[0059] When i mod Nk=0 (the beginning of each stretch), 

[0060] Rule 2: W(i) = W(i-Nk) 0 SubWord (RotWord(W(i-l))) 0 Rcon(i/Nk) 

[0061] When i mod Nk=4 and Nk=8 (the middle cycle of each 8 word stretch), 

[0062] Rule 3: W(i) = W(i-Nk) © SubWord(W(i-l)) 
[0063] where: 

[0064] RotWord(Wtmp) is a bytewise rotation of Wtmp, 
[0065] SubWord is the AES S-box transform, 

[0066] Rcon is the round constant as defined in the AES standard, which is applied 
only to the first byte of the first word in each stretch, while the other bytes are 
passed unchanged, 
[0067] i=0 . . . 4Nr+3, 
[0068] i^e. 

[0069] i=0 . . . 43for Nk=4; 
[0070] i=0 . . . 51 for Nk=6 and 
[0071] i=0 . . . 59 for Nk=8. 

[0072] In other words, for the first word of each new stretch, the special processing 
of steps 54-59 is applied and W(Nk) is calculated as the XOR-combination 62 of W(0) 
from register 120 and the transformed W(Nk-l). For the middle word of each stretch 
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when Nk=8, the special processing only of step 55 is applied. For other words in each 

stretch, the contents of register 120 and register 121 are XOR-combined directly 

without the special processing of steps 54 to 59. 

[0073] With reference to FIG. 4 and FIG. 5 , register W is loaded with W(0) and 
register Wtmp is loaded with W(Nk-l) [e.g. W(7) for Nk=8]. Then the result of the 
calculation, being W(Nk), [e.g. W(8)], is output from XOR gate 162 and stored in 
both RAM 101 [e.g. at location Wo, upper halfl and in register Wtmp 121. Then, 
register W is loaded with W(l), while register Wtmp holds W(Nk), [e.g. W(8)]. Then W 
(Nk+1) [e.g. W (9)] is calculated and stored in RAM 101 [at location Wi, upper half! 
and in register Wtmp. 

[0074] In general, register W is loaded from RAM 101 with W(i), while register Wtmp 
holds the value of W(i+Nk-1). Then W(i+Nk) is calculated and stored both in RAM 
101, at position W(i+Nk) mod 8, upper half (i^e. new values are stored cyclically in the 
upper half 102), and in Wtmp. 

[0075] The key expansion process runs in parallel with the encryption processor 130 
which preferably works word-by-word rather than on blocks 128 bits wide. In this 
manner, the content of W can be passed directly to the encryption processor to be 
used immediately as input for the encrs^tion process. In the alternative, the 
encryption processor 130 may be coupled directly to access RAM 101 to retrieve the 
required words of the round key. This configuration allows more flexibility in the 
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relative timing of the cycles of operation of the encryption engine 130 and the 

expansion processor 107. 

[0076] For each cycle of operation, the new value of Wtmp is such that: 
[0077] Wtmp = Wtmp © W, except for the following cases: 
[0078] When i mod Nk=0, 

[0079] then Wtmp = SubWord (RotWord (Wtmp)) © Rcon (i/Nk) © W 
[0080] When i mod Nk=4 and Nk=8, 
[0081] then Wtmp = SubWord (Wtmp) © W 

[0082] During the key expansion process, the pointer OffSetHiRd 105 effectively 
points to a base word location in RAM 101 either in the upper half 102 and the 
lower half 103. Control of the read locations is implemented by this one-bit pointer 
which respectively selects the read half of the memory. Thus, during the first cycle 
of key expansion (during computation of the second stretch), the initial key words W 
(0) . . . W (7) are read from the lower half 102, i^e. the read flag 105 selects OffSetLo. 
During encryption key expansion, new values of the round keys are always written 
to the upper half 102. 

[0083] At the start, the following -imti-afaratieH -initialization settings apply: 
[0084] OffBetCnt=0, OffSetHiRd=0, OffSetHiWr=l, RndCnt=4Nr+3. 
[0085] The RAIVI 101 is read at address WNk-i, determined by OffSetHiRd and 
OffSetCnt (i.e. OffSetCnt+Nk-1), and stored in Wtmp. 
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[0086] Then the following procedure is executed Nk times: 

[0087] 1. Read the RAM at WofiSetCnt from the lower half, and store it in W. 

[0088] 2. Generate the next expanded key word and write it to Wtmp and to the 

memory at WofiSetCnt in the upper half 102. 

[0089] 3. Increment OfiSetCnt and decrement RndCnt. 

[0090] 4. Update Rmn only after the first cycle of the Nk cycles. 

[0091] All words of the initial key from the lower half 103 have now been used. 
OffSetHiRd is set to 1, so that all subsequent round key words are read from the 
upper half 102. For example, for Nk=8, the memory at address Ws contains W (8). 
[0092] Now, the following procedure is executed repeatedly until RndCnt = Nk-1. 
[0093] 1. Read RAM at OfESetCnt from the upper half (OffSetHi=l) and store it in 
W. 

[0094] 2. Generate the next Round Key word and write it to Wtmp and to the RAM at 

OfESetCnt in the upper half. 

[0095] 3. Update Rcon when O£[SetCnt=0 

[0096] 4. Increment OffSetCnt and decrement RndCnt. 

[0097] For Nk=4, the last calculation is W (43) = W (39) © W (42). OffSetCnt = 43 
mod 4 = 3. 

[0098] For Nk=6, the last calculation is W (51) = W (45) © W (50). OffSetCnt = 51 
mod 6 = 3. 
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[0099] For Nk=8, the last calculation is W (59) = W (51) 0 W (58). OffSetCnt = 59 

mod 8 = 3. 

[0100] So, independent of Nk, the last Round Key word is always stored at 
OffSetCnt=3. 

[0101] At this point, the last Nk round key words are used by the encryption 

processor 130, but there are no more Round Key words to be generated by the 
expansion processor. Thus, the following procedure is executed repeatedly until 
RndCnt=0: 

[0102] 1. Read the RAM at WoflSetCnt from the upper half and store it in W. 
[0103] 2. Increment OffSetCnt and decrement RndCnt. 

[0104] It will be noted that the lower half 103 of the RAM 101 now contains the 
initial encryption key (Nk words), and the upper half 102 of RAM now contains the 
final Nk words of the expanded key. The final Nk words of the expanded key are the 
first Nk words of the decryption key. 

[0105] Thus, the RAM now contains the initial round key for encryption and the 

initial round key for decryption. Therefore, it does not matter whether the next 
operation to be performed by the cryptographic engine is an encryption operation or 
a decryption operation-the expansion processor can commence key expansion 
starting from either the upper half 102 or lower half 101. 

[0106] During decryption, as depicted in Fig. 5, the Encryption Round Keys are 
applied in reverse order. 
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[0107] Therefore, in operation of the present invention, during decryption it is 

necessary to generate W (i) from W (i+Nk) and W (i+Nk-1). 

[0108] The inverse of the key expansion process requires that: 

[0109] Rule 1: W (i-Nk) = W (i) © W (i-1) 

[0110] for all i, except: 

[0111] Rule 2: W (i-Nk) = W (i) © SubWord (RotWord (W (i-1))) © Rcon (i/Nk) 
[0112] when i mod Nk=0, and 

[0113] Rule 3: W (i-Nk) = W (i) © SubWord (W (i-1)) 
[0114] when i mod Nk=4 and Nk=8. 

[0115] Note, that all W (i-Nk) and W (i) have interchanged places, but the complex 
second input is the same as for encryption. 

[0116] Taking Nk=4 as an example, the last W that was generated during 
encryption was W (43). During decryption key expansion, the first time W is loaded, 
it is loaded from RAIM 101; thereafter subsequent W may be obtained from WtmpT_by 
means of multiplexer 125. 

[0117] Thus, the first step is to load W with W (43) (found in the upper RAM half 
102 at Wii, OffSetCnt 3) and Wtmp with W (42) (found in the upper RAM half 102 at 
Wio, OffSetCnt 2). Then, we calculate W (38) = W (42) © W (41) and write the result 
to RAM 101 in the lower half 103 at Ws. The content of Wtmp is then shifted to W, 
which then holds W (42) and Wtmp is loaded with W (41). 
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[0118] In the next cycle, we calculate W(39) = W(43) ® W(42) and write the result to 

RAM 101 at Wi and we shift the content of Wtmp to W, which then holds W(41) and 

we load Wtmp with W(40). This cycle is repeated for successive W. 

[0119] In general, register W is loaded from RAM (or from Wtmp) with W(i) and 

register Wtmp is loaded from RAM with W(i-1). Then W(i-Nk) is calculated and stored 

in lower RAM half at position Wi mod 8 and the content of Wtmp transferred to W. 

[0120] The decryption key expansion process runs in parallel with the decryption 

processor which preferably works word-by-word rather than on blocks 128 bits wide, 

i.e. the content of W is also passed to the decryption engine 140 for use as input for 

the decryption operation. 

[0121] At the start, the following initiahsa ti o n initialization settings apply: 

[0122] Of£SetCnt=3, OffSetHiRd=l, OfiSetHiWr=0, RndCnt=4Nr+3. 

[0123] The RAM 101 is read at address OffSetCnt [Of[SetCnt=3, giving W (4Nr+3), 

Gig. W (43) for Nk=4] and stored in W. 

[0124] Then, the following procedure is executed Nk-1 times: 

[0125] 1. Read the RAM at WoffSetCnt-i mod Nk from the upper half and store it in 
Wtmp [W (42), W (41) and W (40) for Nk=4]. 

[0126] 2. Generate the next expanded key word and write it to RAM at OffSetCnt in 

the lower half [W (39), W (38) and W (37) for Nk=4]. 

[0127] 3. Transfer the content of Wtmp to W 

[0128] 4. Decrement OffSetCnt and decrement RndCnt. 
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[0129] All words from the upper half have now been used. OffSetHlRd is set to 0, so 

all following key words are read from the lower half. For example, for Nk=4, the 

memory at address 3 in the upper half contains W (39). 

[0130] Now, the following procedure is executed repeatedly until RndCnt=Nk-l. 
[0131] 1. Read the RAM at Wo£ESetCnt-i mod Nk from the lower half and store it in 
Wtmp. 

[0132] 2. Generate the next Round Key word and write it to Wtmp and to the 

memory at OffSetCnt in the lower half. 

[0133] 3. Transfer the content of Wtmp to W. 

[0134] 4. Update Rcon when OffSetCnt=0 

[0135] 5. Decrement both OffSetCnt and RndCnt. 

[0136] At this point, the last Nk round key words are used by the decryption 
processor 140 but we do not need to generate more Round Key words. Thus, the 
following procedure is executed repeatedly until RndCnt=0: 

[0137] 1. Read the memory at Woasetcnt-i mod Nk from the lower half and store it in 
Wtmp. 

[0138] 2. Transfer the content of Wtmp to W. 
[0139] 3. Decrement both OffSetCnt and RndCnt. 

[0140] Note that the very last read may be omitted, since it will not be used. 

[0141] In a preferred embodiment, the Sub Word function 55, 155 in the key 

expansion process may be implemented by the same hardware as that which 
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implements the SubBytes transform 20, 31 of the encryption/decryption processes. 

In practice, it is found that this has minimal if any delaying effect on the 

encr5T)tion/decryption processes. Only every Nth round, will the key expansion 

processor compete with the encryption/decryption process for the same hardware. 

[0142] Where the key expansion and cryptographic processes are in lock step on a 

word-by-word basis, the key expansion engine and the cryptographic engine will 

wait for each other before going to the next round, and every Nth round they have 

also to wait for separate access to the S-Box transform functions. However, while 

the cryptographic engine performs the ShiftRow transform 21 or the MixColumn 

transform 22, the key expansion processor can use the S-Box hardware. 

[0143] The minimum amount of memory 101 required for efficient bi-directional 

operation is 2Nk words: one half (Nk) to store the encryption key and the other half 

to store the decryption key. 

[0144] During encryption, as shown in Fig. 4, the first Nk words are taken from the 
encryption (lower) half. All generated round key words are written to the decryption 
(upper) half. At the end of encryption, the decryption (upper) half holds the 
decryption key. 

[0145] During decryption, as shown in Fig. 5, the first Nk words are taken from the 
decryption (upper) half, which is in effect the "initial key" for decryption. All 
generated round key words are written to the encryption (lower) half. Although that 
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means that the encryption key is temporarily overwritten, after decrjrption, the 

encryption key is regenerated. The decryption key is not overwritten. 

[0146] Thus, after a first encryption process, the key expansion processor can 

immediately generate an expanded encryption key or an expanded decryption key, 

by selecting to start either from the lower half 103 or the upper half 102. For first 

time operation, with a new key, it is necessary to perform an encryption operation 

in order to generate the decryption key. 

[0147] It is possible to reduce the amount of memory to as little as Nk words. 
However, this is less efficient in that if a number of consecutive encrj^tion or 
decryption operations are required, each one must be interspersed with a dummy 
decryption or encryption operation to regenerate the initial encryption (or 
decryption) key. In general, this is less desirable. 

[0148] State machine 106 controls the various registers and counters as follows, 
applicable to all cases of Nk=4, 6 or 8. 

[0149] The 3-bit up/down counter OfESetCnt 111 points te-the address to each half of 

the memory. It counts up during encryption; when it reaches Nk-1, then it is reset to 

0 again. It counts down during decryption. When it is 0, it is reset to Nk-1. 

[0150] When O£fSetCnt=0, then Rule 2 for W (i) applies. When OffSetCnt=4 and 

Nk=8, then Rule 3 applies. For all other values of OflSetCnt, Rule 1 applies. 

[0151] The 1-bit variable OffSetHiRd is set to point initially (for the first Nk reads) 

to the lower RAM half during encryption, then to the upper RAIVE half 102 for all 
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subsequent reads. During decryption, OffSetHiRd is set to point initially (for the 

first Nk reads) to the upper RAM half then to t he -te - the lower RAM half 103 for all 

subsequent reads. The 1-bit variable OffSetHiWr is set to point to the upper RAM 

half 102 for all writes during encryption, and to point to the lower RAM half for all 

writes during decryption. The 6-bit down counter RndCnt 110 counts the number of 

rounds. 

[0152] With reference again to FIG. 2, the round constant Rcon 58 must be updated 
(step 59) each cycle, i.e. after each use thereof. 

[0153] For the first cycle, Rcon [1] =1. After each cycle, the value of Rcon is updated 
such that: 

Rcon [i/Nk] =xtime (Rcon [i/Nk-1], 

[0154] i.e. the previous value of Rcon is left-shifted, and when the most significant 
bit=l then the hex value IB is added to Rcon. 

[0155] According to the AES specification, the function Rcon [i/Nk] is called when 
[0156] i mod Nk=0, while Nk < i < Nb (Nr+l). 
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Nk 


Nb 


Nr 


Nb(Nr+l) 


4 


4 


10 


44 


6 


4 


12 


52 


8 


4 


14 


60 



[0157] For Nk=4, Rcon [i/Nk] is called for at i=4, 8, 
value=36h. 

[0158] For Nk=6, Rcon[i/Nk] is called for at i=6, 12, 
value=80h. 

[0159] For Nk=8, Rcon[i/Nk] is called for at i=8, 16, 
value=40h. 



. . . 40, i.e. 10 times. The last 



... 48, i.e. 8 times. The last 



. . . 56, i.e. 7 times. The last 



i/Nk 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


Rcon[i/Nk] 


01 


02 


04 


08 


10 


20 


40 


80 


IB 


36 



[0160] In a preferred embodiment, the Rcon function 58, 59 is implemented as an 8- 
bit shift register, which can shift both left (for encr5rption) and right (for decryption). 
The shift register can be preset to the following values Olh, IBh, 36h, 80h and 40h. 
[0161] For encryption, as shown in Fig. 4, i t is preset to Olh. It shifts to the left, 
except when it reaches 80h, at which point it is preset to IBh. 
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[0162] For decryption, as shown in Fig. 5, it is preset to 36h for Nk=4, 80h for Nk=6 

and 40h for Nk=8. It shifts to the right, except when it reaches IBh, at which point 

it is preset to 80h. 

[0163] Thus, the shift register effectively has three control inputs. A first control 
input cffocts causes a left shift (bit rotation) of the register, which is used during 
each cycle during the encryption key expansion. A second control input efifeete 
causes a right shift (bit rotation) of the register, which is used during each cycle 
during the decryption key expansion. A third control input causes presetting of the 
register with one of a number of predetermined values, according to the current 
value of the register, and the direction (encryption or decryption). 
[0164] It will be noted, in a general sense, that the present invention provides a 
method of generating successive round key words of an expanded key, from an 
initial key, which method maintains the generated successive round key words in 
memory substantially only as long as they are required for use in the generation of 
successive round key words and for use in the parallel operation of a cr3T)tographic 
process. 

[0165] In the preferred embodiment, the initial key words are also maintained in 
the memory. 

[0166] Other embodiments are intentionally within the scope of the accompan3dng 
claims. 
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